A Bilingual Graph-Based Semantic Model for Statistical Machine Translation
Authors
Rui Wang, Hai Zhao, Sabine Ploux, Bao-Liang Lu, and Masao Utiyama
Department of Computer Science and Engineering, Key Lab of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, Shanghai, China
Centre National de la Recherche Scientifique, CNRS-L2C2, France
National Institute of Information and Communications Technology, Kyoto, Japan
[email protected], {zhaohai, blu}@cs.sjtu.edu.cn, [email protected], [email protected]
Similar resources
A new model for Persian multi-part words edition based on statistical machine translation
In English, multi-part words are hyphenated, with the hyphen separating the different parts. Persian also has multi-part words. According to Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...
Improve Statistical Machine Translation with Context-Sensitive Bilingual Semantic Embedding Model
We investigate how to improve bilingual embedding which has been successfully used as a feature in phrase-based statistical machine translation (SMT). Despite bilingual embedding’s success, the contextual information, which is of critical importance to translation quality, was ignored in previous work. To employ the contextual information, we propose a simple and memory-efficient model for lear...
Improving word alignment for low resource languages using English monolingual SRL
We introduce a new statistical machine translation approach, specifically geared to learning translation from low-resource languages, that exploits monolingual English semantic parsing to bias inversion transduction grammar (ITG) induction. We show that in contrast to conventional statistical machine translation (SMT) training methods, which rely heavily on phrase memorization, our approach focu...
Bilingual Correspondence Recursive Autoencoder for Statistical Machine Translation
Learning semantic representations and tree structures of bilingual phrases is beneficial for statistical machine translation. In this paper, we propose a new neural network model called Bilingual Correspondence Recursive Autoencoder (BCorrRAE) to model bilingual phrases in translation. We incorporate word alignments into BCorrRAE to allow it to freely access bilingual constraints at different leve...
Toward Better Chinese Word Segmentation for SMT via Bilingual Constraints
This study investigates building a better Chinese word segmentation model for statistical machine translation. It aims to leverage word boundary information, automatically learned from bilingual character-based alignments, to induce a preferable segmentation model. We propose treating the induced word boundaries as soft constraints to bias the continuous learning of a supervised CRFs mod...